1.   INTRODUCTION

In  this  paper  we  review  applications  of  models  for  univariate 
and  multivariate  series  of  events  (point  processes)  in  computer 
systems  and  statistical  methods  for  the  analysis  of  these  point 
processes.   There  are  many  examples  of  such  series  of  events.  Typi- 
cal examples  include  the  following: 

(i)   occurrences of system failure. These events may be typed as
"hardware" failures or "software" failures, giving a bivariate
point process. They may also be typed by the physical
part  of  the  system  in  which  the  failure  occurred; 
(ii)   arrivals  of  requests  to  a  storage  subsystem.   These  events 
may  be  marked  by  an  identifier  of  the  requested  record; 
(iii)   occurrences  of  exceptions  in  a  system  having  hierarchical 
storage.   These  events  may  be  typed  according  to  the  level 
of  the  hierarchy  at  which  required  information  is  found. 

Point process methodology has been applied to computer system
problems in two areas: computer system reliability and computer
system performance evaluation. In Section 2 we review these
applications, with emphasis on the very broad area of performance
evaluation. What emerges is the need for new data analytic methods
for  the  particular  problems  encountered,  and  the  need  for  simple 
models for positive multivariate time series when the data analysis
is used to suggest models or modify postulated models. Some recent
results  in  the  development  of  methodology  for  statistical  analysis 
of  point  processes  are  described  in  Section  3.   A  final  section 
indicates  some  starts  on  the  development  of  some  new  and  particularly 
suitable  models  for  non-normal  time  series. 

2.   POINT PROCESSES OCCURRING IN COMPUTER SYSTEMS

2.1.   Computer  System  Reliability 

We  describe  first  an  application  of  point  process  methodology 
in  computer  system  reliability  studies,  namely  the  analysis  and 
modelling  of  computer  failure  patterns. 

2.1.1.   Computer  system  failure  patterns.   The  earliest  application 
of  point  process  methodology  to  problems  associated  with  computer 
systems  is  the  analysis  and  modelling  of  computer  failure  patterns 
given  by  Lewis  (1964a).   A  primary  motivation  for  the  construction 
of  computer  failure  models  is  to  analyze  data  from  operational 
systems  and  to  find  ways  of  comparing  and  perhaps  improving  the 
reliability  of  existing  systems.   Prediction  and  optimization  of 
the  reliability  of  future  systems  is  a  second  motivation. 

Reliability  models  for  complex  systems  which  were  current  in 
1964  predict  that  the  failure  pattern  of  a  computer  system  should 
form  a  Poisson  process.   This  model  is  derived  from  the  assumption 
that  the  failures  in  each  component  position  constitute  independent 
renewal  processes.   The  failure  pattern  of  the  system  is  then  formed 
from  the  pooled  failures  of  the  components.   Under  the  stochastic 
assumptions  that  are  made,  it  is  known  (Cox  and  Smith,  1954)  that 
the  pooled  series  of  events  will  be  indistinguishable  from  a  Poisson 
process  over  periods  of  time  which  are  short  compared  with  the  mean 
times-to-failure of the components. The data presented by Lewis
(1964a) show, however, that the times-between-failures of large
computer systems are not exponentially distributed and that
successive times-between-failures are correlated. Physically the observed
clustering  of  failures  and  the  resulting  departures  from  a  Poisson 
process  arise  from  imperfect  repair,  i.e.,  because  failed  components 
are  not  always  located  and  removed  the  first  time  they  cause  system 
failure,  nor  are  the  failed  components  always  needed  for  correct 
system  operation.   Subsidiary  system  failures  are  then  induced  a 
short  time  later. 
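
The two departures noted above can be checked with simple interval statistics. The following is a minimal sketch in Python, not the procedure used by Lewis (1964a): for a Poisson process the intervals between events are independent and exponential, so their coefficient of variation should be near 1 and their lag-1 serial correlation near 0. The function name `interval_diagnostics` and the synthetic data are assumptions made for illustration only.

```python
import math
import random

def interval_diagnostics(event_times):
    """Return (coefficient of variation, lag-1 serial correlation) of the
    intervals between successive events."""
    gaps = [b - a for a, b in zip(event_times, event_times[1:])]
    n = len(gaps)
    mean = sum(gaps) / n
    var = sum((g - mean) ** 2 for g in gaps) / (n - 1)
    # Lag-1 sample covariance between each interval and the next one.
    cov1 = sum((gaps[i] - mean) * (gaps[i + 1] - mean)
               for i in range(n - 1)) / (n - 1)
    return math.sqrt(var) / mean, cov1 / var

# Synthetic Poisson series (assumed data, unit rate) as a reference case.
rng = random.Random(1)
t, poisson_times = 0.0, []
for _ in range(5000):
    t += rng.expovariate(1.0)
    poisson_times.append(t)

cv_sim, rho_sim = interval_diagnostics(poisson_times)
print(f"CV = {cv_sim:.3f}, lag-1 correlation = {rho_sim:.3f}")
```

Clustered failure data of the kind described above would be expected to give a coefficient of variation noticeably greater than 1 and a nonzero serial correlation.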

The  Lewis  (1964a)  paper  deals  with  the  development  of  a  model 
(the  branching  Poisson  process)  for  computer  failure  patterns  which 
accounts  for  the  observed  departures  from  a  Poisson  process.  The 
probabilistic  properties  of  the  model  are  derived  and  used  to 
analyse  three  series  of  computer  failures  (consisting  of  109,  186, 
and  255  events  respectively).   Lewis  (1964b)  discusses  implications 
of  the  model  for  the  use  and  maintenance  of  computer  systems. 

We describe the branching Poisson process model briefly. In
the model, times between original failures of components constitute
a main process {X_i}. At each point of this process an attempt is
made to locate and repair the failure, the attempt succeeding with
fixed probability, independently of other attempts to repair main
failures. Otherwise the failure recurs at times Y_1, Y_1 + Y_2, ...,
Y_1 + ... + Y_S after the initial occurrence. Thus, S+1 unsuccessful
attempts are made in all to locate and remove the source of the
computer failure. The computer failure pattern is then the
superposition of the events in the main process and the events in the
subsidiary processes which the main events generate. The {X_i} are
assumed to be independent, identically and exponentially distributed,
and the intervals Y_j in the subsidiary processes are assumed to be
mutually independent and identically distributed. The branching
Poisson process model is also called the Bartlett-Lewis cluster
process or a Poisson cluster process. Properties of the model have
been developed by several authors; see e.g., Lewis (1969), Oakes
(1975).
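
The structure of the model can be illustrated by simulation. The sketch below is a hedged illustration rather than the model as fitted by Lewis (1964a): it assumes that every repair attempt succeeds independently with the same probability, so that the number of subsidiary failures per cluster is geometric, and that the subsidiary intervals are exponential. The function name and all parameter values are hypothetical.

```python
import random

def simulate_branching_poisson(main_rate, p_success, mean_gap, horizon, rng):
    """Simulate event times of a branching Poisson process on [0, horizon].

    Assumptions of this sketch (not taken from the paper): each repair
    attempt succeeds independently with probability p_success, and the
    subsidiary intervals Y_j are exponential with mean mean_gap.
    """
    events = []
    t = 0.0
    while True:
        t += rng.expovariate(main_rate)           # interval X_i of the main process
        if t > horizon:
            break
        events.append(t)                          # original failure
        s = t
        while rng.random() >= p_success:          # repair attempt fails
            s += rng.expovariate(1.0 / mean_gap)  # subsidiary interval Y_j
            if s <= horizon:
                events.append(s)                  # subsidiary failure
    events.sort()                                 # superpose main and subsidiary events
    return events

rng = random.Random(2)
failures = simulate_branching_poisson(1.0, 0.5, 0.1, 20000.0, rng)
observed_rate = len(failures) / 20000.0
print(f"observed failure rate: {observed_rate:.3f}")
```

Under these assumptions the mean cluster size is 1/p_success, so the long-run failure rate is main_rate / p_success; comparing it with the rate of the main process gives a rough sense of how much imperfect repair inflates the observed failure rate.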


The  result  of  the  statistical  analysis  given  by  Lewis  (1964a) 
is the demonstration that the differing rates of failure in the three
systems under study are due to the adequacy or inadequacy of the
maintenance on each system. This was done by estimating (rather crudely)
the  rate  of  failures  in  the  main  Poisson  process  and  the  expected 
number,  E(S+1),  of  unsuccessful  attempts  to  fix  an  original  failure. 
Then   E(S+1)   measures  the  adequacy  of  the  maintenance  of  the  system. 
We  know  of  no  advance  in  the  statistical  analysis  of  Poisson  cluster 
processes  since  the  Lewis  (1964a)  paper. 

2.2.   Computer System Performance Evaluation

Because  of  the  complexity  of  existing  and  proposed  computer 
systems,  detailed  measurements  of  running  systems  are  needed  in  order 
to  develop  system  models.   This  measurement  and  modelling  comprises 
just  one  facet  of  computer  system  performance  evaluation.   Ultimate 
goals  of  performance  evaluation  include  tuning  of  existing  systems 
and  prediction  (usually  via  simulations)  of  the  performance  of  pro- 
posed systems.   For  example,  it  is  desirable  to  have  an  airline 
reservation  system  which  is  efficient  from  both  the  customers'  and 
airline's  points  of  view  in  the  sense  that  it  should  respond  quickly 
and  reliably  at  a  reasonable  cost. 

Given  the  complexity  of  computer  systems  and  the  resulting 
relative  difficulty  of  carrying  out  meaningful  performance  evalua- 
tions and  designs,  the  collection  and  analysis  of  measurement  data 
from  representative  systems  to  identify  and  characterize  significant 
performance  phenomena  is  necessary.   The  availability  of  such  meas- 
urements presents  the  possibility  of  obtaining  thereby  empirically 
valid,  parameterized  mathematical  models  for  the  workload  of  the 
system.   For  performance  evaluation  studies,  in  addition  to  workload 
or  program  behavior  models,  the  analyst  needs  a  model  for  the  com- 
puter system  or  subsystem  structure  and  frequently  uses  a  network  of 
queues  as  a  system  model.   Such  networks  provide  a  convenient  means 
of  representing  the  interaction  between  the  processing  and  input- 
output  resources  of  (multiprogrammed)  computer  systems  and  sub- 
systems.  There  is  a  large  literature  dealing  with  queueing  network 
models;  see  e.g.,  Gaver  (1967),  Lewis  and  Shedler  (1971),  Buzen 
(1971), Moore (1971), and Lavenberg and Shedler (1976). Under the
usual convenient, but not necessarily realistic, queueing-theoretic
assumptions (e.g., independent and identically, often exponentially,
distributed service times) analyses of queueing network models
based on a "numbers-in-queue" state space can be carried out; see
Jackson (1963), Baskett, Chandy, Muntz and Palacios (1975), Reiser
and Kobayashi (1975), Kelly (1975, 1976) and Gelenbe and Muntz
(1976). These analyses yield expressions for stationary queue length
distributions that can be evaluated numerically to provide measures
of system performance such as device "utilizations" and job
"throughput." All these models use global assumptions of independence;
measurements usually indicate, for example, that interarrival times are
correlated, and incorporating this dependence in the model will
sometimes give very different results; see, for example, Jacobs (1977).

Other  measures  of  system  performance  (calculated  as  sums  of 
queueing  times)  involve  the  distribution  of  times  for  a  job  to 
traverse  a  portion  of  the  network.   These  times  (in  closed  networks 
complete  circuits  or  loops,  and  in  open  networks  times  from  source 
to  sink)  are  often  interpretable  as  job  "response  times"  and  these 
response  times  are  likely  to  be  particularly  sensitive  to  workload 
characteristics.   Analyses  based  on  the  numbers-in-queue  state 
space  yield  expected  values  for  response  times,  but  do  not  yield 
other  characteristics  of  interest  such  as  percentiles  or  quantiles. 
Since  alternative  analyses  to  provide  these  characteristics  are 
in  general  not  available,  it  is  necessary  to  undertake  simulation 
studies  of  the  queueing  networks.   Such  simulations,  and  indeed 
simulations  of  more  complex  queueing  networks  under  more  realistic
stochastic  assumptions,  are  inherently  difficult,  and  are  likely  to 
be  time-consuming  and  costly  to  carry  out. 
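
As a small illustration of estimating response time quantiles by simulation, the following sketch simulates jobs passing through single-server first-come-first-served queues in series and reports the mean and an empirical 95th percentile of the response time. The network, the exponential assumptions, the rates, and the function names are all assumptions chosen for simplicity; they are not a model from the papers cited above.

```python
import random

def tandem_response_times(arrival_rate, service_rates, n_jobs, rng):
    """Response times of n_jobs passing through single-server FIFO queues
    in series, with Poisson arrivals and exponential service times."""
    arrival = 0.0
    last_departure = [0.0] * len(service_rates)
    responses = []
    for _ in range(n_jobs):
        arrival += rng.expovariate(arrival_rate)
        ready = arrival                               # time the job reaches the next queue
        for k, mu in enumerate(service_rates):
            start = max(ready, last_departure[k])     # wait for the previous job to leave
            ready = start + rng.expovariate(mu)       # service completion at station k
            last_departure[k] = ready
        responses.append(ready - arrival)             # source-to-sink time
    return responses

def quantile(values, q):
    """Empirical q-quantile taken from the sorted sample."""
    ordered = sorted(values)
    return ordered[min(int(q * len(ordered)), len(ordered) - 1)]

rng = random.Random(3)
resp = tandem_response_times(0.5, [1.0, 1.0], 100000, rng)
mean_resp = sum(resp) / len(resp)
p95_resp = quantile(resp, 0.95)
print(f"mean response = {mean_resp:.2f}, 95th percentile = {p95_resp:.2f}")
```

Replacing the exponential interarrival generator with a correlated input sequence would be one way to probe the sensitivity to workload characteristics mentioned above.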

We  describe  three  applications  of  point  process  methodology 
in  computer  system  performance  evaluation:   analysis  of  page  excep- 
tions in  a  two-level  memory,  analysis  of  exceptions  in  a  three- 
level  storage  hierarchy,  and  analysis  of  transaction  processing  in 
a  data  base  management  system. 


2.2.1.   Page exceptions in a two-level memory.   The following brief
description  provides  the  computer  system  context  for  the  discussion 
of  page  exception  processes  given  below.  We  consider  a  system  whose 
memory  resource  (for  storage  of  information)  comprises  a  main  memory 
and  an  auxiliary  memory,  and  assume  that  main  memory  is  the  execution 
store,  i.e.,  only  information  that  is  resident  therein  can  be  pro- 
cessed. We  also  assume  that  the  auxiliary  memory  is  large  enough  to 
hold  all  information  required  by  a  program  which  is  to  be  processed. 
When  such  a  system  operates  in  a  so-called  paging  environment, 
units  of  equal  size  called  pages  partition  all  of  the  information 
that  is  explicitly  addressable  by  the  single  processor  (central 
processing  unit).   Similarly,  page-size  sections  called  page-frames 
partition  main  memory.   It  is  possible  to  execute  a  program  by 
supplying  it  with  only  a  few  page-frames  of  main  memory,  as  follows. 
When  the  page  containing  the  first  executable  instruction  has  been 
loaded  into  some  page  frame,  execution  begins  and  continues  until 
the  program  requires  some  information  not  found  in  main  memory.  The 
operating  system  fetches  the  page  containing  the  missing  information 
from  auxiliary  memory  (overwriting  some  page  currently  in  main 
memory),  and  execution  of  the  program  continues,  and  so  forth.   In 
demand  paging,  information  enters  main  memory  only  as  a  result  of 
an  attempt  (detected  by  the  system  hardware)  to  use  information  not 
currently  in  main  memory.   A  page  exception  is  an  instance  of  this 
implicit  "demand"  for  a  page  which  is  not  in  main  memory.   When 
dealing  with  large  programs  or  in  a  multiprogramming  mode  in  which 
main  memory  is  shared  among  several  programs  it  is  usually  the 
case  that  main  memory  is  filled  when  the  system  must  fetch  another 
page  from  auxiliary  memory.   Consequently,  it  is  necessary  to  choose 
a  page  frame  in  main  memory  to  be  overwritten.   The  replacement 
algorithm  is  the  rule  governing  this  choice.   Most  of  the  time, 
before  overwriting  the  chosen  page  frame,  the  system  must  save  the 
content  of  the  page  frame.   (See  Lewis  and  Shedler  (1971)  for  an 
analysis  of  aspects  of  resource  contention  in  multiprogrammed 
demand  paging  systems.) 


The frequency and pattern of page exceptions strongly influence
the performance (in fact, the feasibility) of a demand paging
system. Accordingly, the study of page reference patterns and page
exceptions (as a function of page size, main memory capacity, and
replacement algorithm) is of interest to the system designer who
must determine pertinent system resources and select system control
algorithms and parameters. In particular, he would like to choose
a page size, replacement algorithm and main memory capacity so as
(in uniprogrammed mode) to minimize the (long-run average) page
exception rate. Thus it would be useful to know the stochastic
structure of the reference process.

We can study several related stochastic processes in order to
characterize  page  reference  patterns: 

(i)   reference strings {R_i}, i.e., sequences of page references,
where R_i is the page referenced by the program at time i.
We can think of these as a multivariate point process (Cox
and Lewis, 1972) in discrete time, the multivariate aspect
being that the events (references to a page) which occur at
each time instant i are of several types (different pages);
(ii)   distance strings {D_i}, e.g., sequences of stack distances
for least recently used (LRU) replacement, as defined in
Mattson, Gecsei, Slutz and Traiger (1970), where D_i is the
total number of distinct pages referenced since the last
reference to R_i;
(iii)   the point processes corresponding to page exceptions for
various (fixed) main memory capacities c, i.e. (discrete)
times i at which D_i exceeds the main memory capacity.
Denote this process by {T_j(c); j = 1, 2, ...}, where T_j(c) is
the time of the jth page exception in memory of capacity c.
As c increases, fewer page exceptions (under LRU replacement)
occur; does this thinning result in a Poisson process
when c is large?

It is not necessarily simple to go (probabilistically) from
one of these representations to another. It may, of course, be
that one of the representations is more convenient than any of the
others in a particular application. The distance string
representation {D_i} suppresses page names, which may be advantageous in
that the process should be more nearly stationary than the process
{R_i}.
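
Going computationally (as opposed to probabilistically) from a reference string to the other two representations is straightforward. The sketch below derives LRU stack distances from a reference string and then the page exception times for a given capacity. It is an illustrative, unoptimized version with hypothetical function names; it assumes the convention that the distance counts the referenced page itself (so an immediate re-reference has distance 1) and that a first reference has infinite distance.

```python
def lru_stack_distances(references):
    """LRU stack distance D_i for each reference R_i.

    Convention assumed here: the distance is the position of the page in
    the LRU stack (1 = most recently used); first references get infinity.
    """
    stack = []                      # most recently used page first
    distances = []
    for page in references:
        if page in stack:
            d = stack.index(page) + 1
            stack.remove(page)
        else:
            d = float("inf")        # page never referenced before
        stack.insert(0, page)       # referenced page moves to the top
        distances.append(d)
    return distances

def exception_times(distances, capacity):
    """Discrete times i (starting at 1) at which D_i exceeds the capacity."""
    return [i for i, d in enumerate(distances, start=1) if d > capacity]

refs = ["a", "b", "c", "a", "b", "d", "a"]
dist = lru_stack_distances(refs)
print(dist)
print(exception_times(dist, 2), exception_times(dist, 3))
```

One pass over the reference string yields the exception processes for every capacity c at once, which is why the distance string is a convenient intermediate form for traces with millions of references.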

The Lewis and Shedler (1973) paper describes analysis and
modelling of page exceptions as a univariate, unmarked point process.
The available data consists of the reference string {R_i}, a sequence
of  approximately  8.8  million  references  to  some  517  distinct  4,096 
(4K)  byte  pages  for  a  particular  (relatively  small)  program.   From 
the  reference  string,  the  distance  string  was  derived,  and  in  turn 
the  sequence  of  times  (in  number  of  references)  between  page  excep- 
tions for  each  of  several  main  memory  capacities.   For  the  smallest 
main  memory  capacity,  some  1807  page  exceptions  occur;  for  the 
largest  capacity,  517  page  exceptions  occur.   The  voluminous  nature 
of  this  data  is  characteristic  of  computer  system  data. 

The  initial  basis  for  the  analysis  and  modelling  is  available 
theory  on  rare  events  and  thinning  of  point  processes  (Daley  and 
Vere-Jones,  1972,  Section  5.3)  suggesting  that  these  relatively 
rare  events  (page  exceptions)  should  form  approximately  a  Poisson 
process.   An  analysis  of  the  data  was  undertaken  to  confirm  or  re- 
ject and  extend  the  model.   The  analysis  shows  quickly  that  the 
Poisson  model  for  page  exceptions  is  grossly  inadequate.   A  direct 
examination  of  the  distance  string,  as  well  as  a  spectral  analysis, 
indicates  the  presence  of  an  alternation  or  two-state  phenomenon 
(a  consequence  of  so-called  locality  of  reference),  and  on  this 
basis  a  two-state  univariate  semi-Markov  generated  point  process 
model (Cox, 1963, Cox and Lewis, 1966, Ch. 7) for the process of
page  exceptions  is  formulated  and  found  to  characterize  the  data 
adequately.   Unfortunately  fitting  this  model  to  the  voluminous 
available  data  is  not  simple,  partly  because  the  marginal  distri- 
bution of  times  between  page  exceptions  is  a  mixture  of  a  discrete 
random  variable  and  a  very  skewed  continuous  random  variable;  see 
Figure 1. We feel that some of the new models described in the
final section of this paper are more appropriate for this data than the
univariate semi-Markov generated point process.

2.2.2.   Exceptions  in  a  three-level  storage  hierarchy.   A  demand- 
paged  computer  system  may  have  a  more  general  hierarchy  of  storage 
than  the  two-level  structure  described  above.   In  particular,  it 
may  also  have  a  very  large  store  (a  mass  store)  for  data  which  is, 
hopefully,  infrequently  needed.   An  analysis  of  exceptions  in  such 
a  three-level  storage  hierarchy,  managed  under  LRU  (least  recently 
used)  replacement  and  in  which  data  is  staged  between  intermediate 
levels  (cf.  Slutz  and  Traiger,  1972),  is  given  by  Gaver,  Lewis  and 
Shedler (1974). Units of equal size called blocks, each of which
is further divided into units of equal size called pages, partition
the explicitly addressable information in the system. Similarly,
page-size sections called page-frames partition level 1 of the
storage  hierarchy  (the  execution  store),  and  block-size  sections 
called  block-frames  partition  levels  2  and  3  of  the  hierarchy;  see 
Figure  2.   Two  types  of  exceptions  occur  here,  "hits  to  level  2" 
and  "hits  to  level  3,"  respectively. 

In  a  hierarchy  such  as  this  encountered  in  practice,  the 
number  of  hits  to  level  2  is  typically  several  orders  of  magnitude 
larger  than  the  number  of  hits  to  level  3.   In  the  Gaver,  Lewis 
and  Shedler  (1974)  paper  the  available  trace  data  consists  of  some 
34.7  million  references  to  166  distinct  32,768  (32K)  byte  blocks; 
there  are  several  hundred  hits  to  level  3,  but  hundreds  of  thousands 
of  hits  to  level  2  for  given  pairs  of  capacities  of  levels  1  and  2 
of  the  storage  hierarchy.   Thus,  a  complete  description  of  the  bi- 
variate  exception  process  consists  of  hundreds  of  thousands  of 
interval  and  point-type  pairs. 

FIG. 2. Staging of data in a three-level storage hierarchy.

Such a voluminous amount of data is not only difficult to
comprehend, it is also expensive to manipulate. As a result, the
statistical analysis and modelling is based on sequences {Y_i}
and {N(Y_i)}, respectively the intervals between successive hits
to level 3 and the counts of hits to level 2 between successive
hits to level 3. The starting point of the analysis is a set of
scatter diagrams of points in the [Y, N(Y)] plane. These scatter
diagrams reveal the apparent existence of two distinct kinds of
referencing behavior. The capacities of levels 1 and 2 of the
storage hierarchy, in pages and blocks, respectively, are denoted
by c_1 and c_2; together with b_1 and b_2, respectively the page
size and block size in bytes, these are the basic hierarchy parameters.
For each pair of capacities c_1 and c_2 a striking two-line
relationship is observed in the graphical display; see Figure 3. Points
in the [Y, N(Y)] plane appear to be of two types, and for each of
the two types, points of that type cluster about a straight line
(through the point Y = 1, N(Y) = 0). The analysis and modelling
in the Gaver, Lewis and Shedler (1974) paper proceeds from this
observed double-linearity.
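
The reduction of the bivariate exception process to interval and count pairs can be sketched as follows. This is an illustration under assumed conventions, not the authors' code: the input is a time-ordered list of (time, level) records with level 2 or 3, and the function name is hypothetical.

```python
def intervals_and_counts(events):
    """Reduce a typed exception series to pairs (Y_i, N(Y_i)).

    events: time-ordered list of (time, level), level in {2, 3}.
    Y_i is the interval between successive hits to level 3 and N(Y_i)
    the number of hits to level 2 falling inside that interval.
    """
    pairs = []
    last_level3 = None
    count_level2 = 0
    for time, level in events:
        if level == 3:
            if last_level3 is not None:
                pairs.append((time - last_level3, count_level2))
            last_level3 = time          # start a new interval
            count_level2 = 0
        elif level == 2 and last_level3 is not None:
            count_level2 += 1           # hit to level 2 inside the current interval
    return pairs

trace = [(5, 3), (7, 2), (9, 2), (12, 3), (13, 3), (20, 2), (30, 3)]
print(intervals_and_counts(trace))
```

Two hits to level 3 at consecutive references give the pair (1, 0), which corresponds to the point Y = 1, N(Y) = 0 through which the lines in the scatter diagram pass. Hits to level 2 before the first hit to level 3 are ignored in this sketch.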

2.2.3.   Transaction  processing  in  a  data  base  management  system.   A 
data  base  management  system  provides  access  for  many  users  to  a 
(typically  very  large)  data  base  managed  by  a  computer  system.   Air- 
line reservations  systems,  banking  and  inventory  systems  are  familiar 
examples.   Such  a  system  should  respond  to  a  query  in  a  reasonably 
short  time,  given  the  number  of  users  and  the  nature  of  the  user 
environment.   This  should  also  be  accomplished  as  economically  as 
possible,  taking  into  account  direct  customer  (waiting)  costs  and 
computer  system  resource  utilization.   These  two  criteria,  fast 
response  and  low  cost,  are  in  general  antithetical. 

The  Lewis  and  Shedler  (1976)  paper  deals  with  methods  for  the 
examination  of  nonstationary  univariate  point  processes  which  can 
be  applied  to  obtain  a  graphical  and  mathematical  description  of 
the  behavior  of  a  running  data  base  management  system.   Such  a 
description  provides  a  useful  starting  point  for  studies  aimed  at 
workload  characterization,  a  central  problem  in  performance  evalu- 
ation of  data  base  management  systems.   Stochastic  models  of  the 
kind  obtained  by  Lewis  and  Shedler  (1976)  have  application  to  the 
detailing  of  proposed  (e.g.,  queueing  network)  system  models  and 
to  the  validation  of  such  system  models. 

FIG. 3. Scatter diagram for bivariate exception process
in a three-level storage hierarchy; from Gaver,
Lewis and Shedler (1974).

The analysis by Lewis and Shedler (1976) is of data obtained
from  an  IMS  (Information  Management  System)  data  base  management 
system.   IMS  (IBM  Corp.,  1973)  is  a  processing  program  for  the 
implementation  of  large  data  bases  accessed  in  common  by  several 
applications.   The  IMS  program  executes  under  the  operating  system 
of  the  computer  system,  and  users  can  access  the  total  data  base 
from remote terminals by entering messages called transactions.
A  particular  transaction  uses,  and  thus  uniquely  identifies,  an 
application  program  which  processes  the  message  (transaction)  and 
accesses  the  data  base.   Data  Language/I  (DL/I)  is  the  data  manage- 
ment facility  of  IMS.   The  execution  of  an  application  program  gives 
rise  to  a  sequence  of  calls  to  the  DL/I  component  of  IMS.   In  a 
computer  system  running  IMS,  the  operating  system  occupies  a  portion 
of  memory.   The  IMS  program  also  occupies  a  portion  of  memory. 
Application  programs  reside  in  secondary  storage  in  an  application 
program  library.   For  execution  an  application  program  must  be 
loaded  into  one  of  several  (typically  three  or  four)  regions  in 
memory  called  message  processing  regions.   The  data  base  resides 
in  secondary  storage,  and  in  response  to  transaction  initiations, 
data  enter  memory  for  processing. 

The  data  on  the  processing  of  transactions  in  the  Lewis  and 
Shedler  (1976)  paper  was  obtained  from  a  computer  system  running 
IMS for production control; entry of data into this system is
on-line and governed by the occurrence of events in the production line.
The  epochs  of  time  at  which  individual  DL/I  calls  were  completed 
(i.e.,  control  returned  to  the  application  program)  were  recorded, 
along  with  information  sufficient  to  identify  the  initiation  times 
of  individual  transactions. 

In  analyzing  the  transaction  initiation  data,  there  are  a 
number  of  prior  assumptions  that  can  be  made  about  the  data  to 
serve  as  a  starting  point.   The  purpose  of  the  data  analysis  is 
to  confirm  these  assumptions  or  to  point  to  suitable  modifications. 
Since  the  data  was  taken  over  six  whole  days  (typically  some 
25,000  transactions  per  day),  a  time-of-day  effect  would  be  expected 
as activity builds up through the working day and then declines
during  the  evening.   Thus  any  kind  of  initial  analysis  based  on  an 
assumption  of  stationarity  is  inappropriate.   The  usual  null  model 
is  a  non-homogeneous  Poisson  process;  this  could  be  reasonable 
since  the  transaction  initiation  process  is  a  superposition  of  in- 
puts from  a  number  of  sources  (users).   Because  each  user's  activity 
is  likely  to  consist  of  a  (random)  number  of  transactions  after 
initial  sign-on,  some  clustering  in  the  data  might  be  expected.  An 
appropriate model here is the non-homogeneous branching Poisson
process  or  Poisson  cluster  process  (Lewis,  1967).   In  this  process 
an  initial  primary  (main)  event  generates  a  finite  sequence  of 
secondary  (subsidiary)  events;  the  complete  process  is  then  the 
superposition  of  the  primary  and  secondary  events,  where  a  non- 
homogeneous  Poisson  process  generates  the  main  events.   If  there 
are  enough  primary  events  (high-activity)  so  that  the  number  of 
active  secondary  processes  is  large,  the  process  is  hard  to  dis- 
tinguish from  a  non-homogeneous  Poisson  process. 
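
A non-homogeneous Poisson process of the kind used as the null model here can be simulated by thinning: candidate events are generated from a homogeneous Poisson process at a bounding rate, and each candidate at time t is kept with probability rate(t) divided by that bound. The sketch below is a generic illustration, not the analysis procedure of the cited paper; the time-of-day rate function and all numerical values are hypothetical.

```python
import math
import random

def simulate_nhpp(rate, rate_max, horizon, rng):
    """Simulate a non-homogeneous Poisson process on [0, horizon] by thinning.

    rate_max must be an upper bound for rate(t) on the whole interval.
    """
    events = []
    t = 0.0
    while True:
        t += rng.expovariate(rate_max)           # candidate from a homogeneous process
        if t > horizon:
            return events
        if rng.random() * rate_max < rate(t):    # keep with probability rate(t)/rate_max
            events.append(t)

def hourly_rate(t):
    """Hypothetical rate (transactions per hour) at time t hours, peaking at midday."""
    return 200.0 + 1800.0 * math.sin(math.pi * t / 24.0) ** 2

rng = random.Random(4)
day = simulate_nhpp(hourly_rate, 2000.0, 24.0, rng)
midday = sum(1 for t in day if 6.0 <= t < 18.0)
print(f"{len(day)} transactions, {midday} of them between hours 6 and 18")
```

Using each simulated event as the main event of a short cluster of secondary events would turn this into a sketch of the non-homogeneous cluster process described above.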

Starting  from  these  assumptions,  the  analysis  of  the  data 
proceeds  as  follows.   A  very  rough,  model-free  procedure  gives  an 
estimate  of  the  rate  function  for  the  transaction  initiation  process 
over  the  whole  day.   On  the  basis  of  this  trend  analysis,  relatively 
homogeneous  high-  and  low-activity  periods  during  the  day  are  se- 
lected, and  an  attempt  is  made  to  verify  the  non-homogeneous  Poisson 
process  model  or  the  cluster  process  model.   Based  on  this  local 
analysis  and  modelling  of  the  transaction  initiation  process,  more 
formal  model-dependent  procedures  are  applied  to  the  transaction 
rate  function  for  the  several  days.   The  Poisson  assumption  is  found 
to  be  reasonably  valid  for  high-activity  periods;  clustering  becomes 
more  evident  in  low-activity  periods. 
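
A very rough, model-free rate estimate of the kind mentioned at the start of this analysis can be obtained by counting events in equal-width time bins. The sketch below is one simple possibility and is not claimed to be the procedure used by Lewis and Shedler (1976); the function name and the sample times are hypothetical.

```python
def binned_rate(event_times, start, end, n_bins):
    """Crude rate estimate: events per unit time in n_bins equal-width bins."""
    width = (end - start) / n_bins
    counts = [0] * n_bins
    for t in event_times:
        if start <= t < end:
            # Guard against rounding placing an event just below `end` past the last bin.
            k = min(int((t - start) / width), n_bins - 1)
            counts[k] += 1
    return [c / width for c in counts]

sample_times = [0.5, 1.5, 1.6, 3.9]
print(binned_rate(sample_times, 0.0, 4.0, 4))
```

Periods over which the binned rate is roughly constant are natural candidates for the relatively homogeneous high- and low-activity segments on which the local analysis is carried out.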

3.   STATISTICAL ANALYSIS OF POINT PROCESSES IN COMPUTER SYSTEMS

3.1.   The Nature of Computer System Data

There  are  five  important  characteristics  of  data  obtained 
during  the  measurement  phase  of  computer  system  performance  studies. 
(i)   The amounts of data available are staggering; often several
sequences or series of events are observed for times producing
millions of observations.

(ii)   The  times-between-events  are  often  overdispersed  relative 
to  an  exponential  distribution,  and  in  many  cases  contain 
discrete  components. 
(iii)   Stationarity  is  often  not  a  reasonable  assumption;  frequently 
when  stationarity  is  reasonable  it  is  because  there  is 
random  switching  back  and  forth  between  several  possibly 
stochastic  modes. 

(iv)   There  are  many  situations  in  which  there  are  fairly  gross 
inhomogeneities  in  the  data;  these  can  usually  be  tied  to 
external  variables  such  as  the  number  of  users  of  the  system, 
or  to  the  time  of  day. 
(v)   Sometimes the data suggests that there may simply be no
stochastic regularity involved. Of course, it could be that,
in  line  with  (iii),  not  enough  time  is  involved  to  show 
emergent  patterns. 

As  a  consequence  of  (v)  there  is  a  procedural  difficulty. 
Consider  starting  a  rough  exploratory  analysis  of  a  series  of  events 
by  smoothing  the  data  to  obtain  a  graph  of  the  event  rate  over  time. 
Suppose  we  observe  that  over  the  first  100,000  events  the  rate  is 
fairly  constant  and  that  beyond  this  it  changes,  a  phenomenon  which 
can  be  seen  by  eye  and  validated  by  simple  statistical  methods. 
Should  we  analyze  the  data  in  two  parts,  and  if  so,  how  do  we 
characterize  the  process  in  toto?   Should  we  take  more  data  and 
hope  to  distinguish  "all"  the  possible  stochastic  modes? 

On  the  other  end  of  the  scale,  when  we  examine  the  apparently 
stationary  segments  of  the  data,  questions  of  more  microscopic 
stationarity  sometimes  arise.   This  is  reminiscent  of  the  self- 
similarity concept for physical data put forward by Mandelbrot (1967).

An  unfortunate  consequence  of  all  of  the  above  is  that  we  can 
seldom  sample  the  data  over  time  to  achieve  some  compression.  Even 
in  the  best  of  circumstances,  intelligent  use  of  sampling  techniques 
requires well-formulated questions; the tendency in computer system
measurement often is to feel that if an enormous amount of data is
collected, it will be possible to answer any question that may
arise!

3.2.   Recent  Developments 

Lewis  (1972)  summarizes  some  of  the  recent  developments  in  the 
statistical  analysis  of  point  processes;  see  also  Cox  (1972)  and 
Brown  (1972).   There  is  also  a  book  on  point  processes  by  Snyder 
(1975)  and  a  sequence  of  papers  by  Brillinger  (1975a,  1975b). 
Brillinger  bases  his  work  heavily  on  spectral  methods  for  stationary 
processes  and  his  work  has  many  points  of  contact  with  that  of  Cox 
and  Lewis.   But  Snyder  (1975)  does  not  reference  any  of  Brillinger's 
papers,  and  does  not  reference  Cox  and  Lewis  (1966).   What  then  is 
the  point  of  contact  between  these  lines  of  development  in  the 
analysis  of  series  of  events? 

The  work  of  Cox  and  Lewis  and  that  of  Brillinger  are  fairly 
complementary.   The  former  is  highly  data  analytic  in  the  sense 
that  there  is  concern  with  validation  of  assumptions  and  models  and 
analysis  of  trends;  the  latter  is  concerned  mainly  with  spectral 
methods  for  stationary  (univariate  and  multivariate)  point  processes 
based  on  models  such  as  self-exciting  point  processes  (Hawkes,  1972, 
Hawkes  and  Oakes,  1974)  which  lend  themselves  easily  to  spectral 
methods. Snyder (1975) bases his statistical methods almost entirely
on likelihood analysis, which rests on the strong assumption that the
stochastic mechanism generating the series of events is known and that
it is possible to write down the "sample-function density." We have
doubts  that  this  approach  will  be  useful  in  analyzing  data  from 
computer  systems,  in  particular  since  for  this  type  of  data  there  do 
not  appear  to  be  any  compelling  physical  models  other  than,  in  some 
cases,  Poisson  and  Poisson  cluster  processes. 

Consider now some specific areas of development.

3.2.1. Trend analysis and detrending of point processes. In many
fields  of  application,  and  particularly  in  computer  system  data,  it 
has  become  increasingly  apparent  that  stationary  point  processes 
are  at  best  a  convenient  mathematical  fiction.   Most  data  exhibit 
fairly  subtle  trends  and  methods  for  testing  for  these  trends  are 
known (Cox and Lewis, 1966, Ch. 3); other data, however, exhibit
gross  trends,  e.g.,  time-of-day  effects  in  the  series  of  arrivals 
at  a  queue,  and  techniques  for  the  analysis  and  characterization 
of  such  data  are  only  now  beginning  to  be  developed. 

The  situation  is  analogous  to  that  in  ordinary  regression 
analysis  and  time-series  analysis  where  we  might  want,  for  example, 
to  estimate  parameters  in  an  assumed  (linear)  function  for  the  mean, 
test  the  model  for  the  mean  function  and  then  examine  the  model 
which  is  assumed  for  the  residuals.   The  latter  could  include 
examining  the  residuals  to  test  for  independence,  estimating  the 
spectrum  of  the  residuals  and  testing  the  assumed  normality  of  the 
residuals.   Techniques  for  these  problems  in  the  linear  normal  model 
are known (see e.g., Hannan, 1970).

By comparison, in point processes we might want to:
(i)   estimate the rate function $\lambda(t)$, using either specific
functional models or smoothing techniques;
(ii)   test specific functional models for $\lambda(t)$;
(iii)   detrend the point process, examine the 'residual' process
and test the usual hypothesis that the events are generated
by a homogeneous Poisson process.

When dealing with non-homogeneous Poisson processes the most
appropriate detrending technique (Lewis, 1970, 1972) seems to be to
transform the time-scale so that the ith event occurring at time
$t_i$ now occurs at time

$$\tau_i = \int_0^{t_i} \hat{\lambda}(u)\,du ,$$

where $\hat{\lambda}(t)$ is some estimate of the rate function $\lambda(t)$. Note that
if $\lambda(t)$ is known, the $\{\tau_i\}$ process is a Poisson process of
rate one. When $\lambda(t)$ is estimated from the data, the distributional
problems associated with determining properties of the $\{\tau_i\}$
process are difficult.
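
The time-scale transformation above can be sketched numerically as follows. This is only an illustration under stated assumptions: the function name `rescale_event_times`, the trapezoidal integration on a fixed grid and the example rate functions are hypothetical choices, not part of the methods cited in the text.

```python
def rescale_event_times(event_times, rate_fn, t_end, n_grid=10_000):
    """Return tau_i = integral from 0 to t_i of rate_fn(u) du for each t_i.

    rate_fn plays the role of the estimated rate function; the integral is
    approximated by the trapezoidal rule on n_grid equal cells of (0, t_end).
    """
    if any(t < 0.0 or t > t_end for t in event_times):
        raise ValueError("event times must lie in [0, t_end]")
    step = t_end / n_grid
    grid = [k * step for k in range(n_grid + 1)]
    # Cumulative trapezoidal integral of the rate at every grid point.
    cum = [0.0]
    for k in range(n_grid):
        cum.append(cum[-1] + 0.5 * step * (rate_fn(grid[k]) + rate_fn(grid[k + 1])))
    rescaled = []
    for t in event_times:
        k = min(int(t / step), n_grid - 1)          # grid cell containing t
        frac = (t - grid[k]) / step                 # position inside the cell
        rescaled.append(cum[k] + frac * (cum[k + 1] - cum[k]))
    return rescaled


# Hypothetical examples: a linearly increasing rate and a constant rate.
linear_case = rescale_event_times([1.0, 2.0], lambda u: u, t_end=2.0)
constant_case = rescale_event_times([0.3, 1.7], lambda u: 2.0, t_end=2.0)
```

With a constant rate the transformation merely rescales time, while with an increasing rate the later events are spread further apart on the new scale, which is what removes the trend.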

Results for estimating parametric rate functions are given by
Cox (1972) and Lewis (1972). These methods are developed by Lewis
and Shedler (1976) and applied to the statistical analysis of
transaction processing in data base systems. It is probably
best to deal with fairly regular point processes by using log transforms
of the intervals between events and then using ordinary time-series
methods (Cox and Lewis, 1966, Ch. 3). It is still difficult
to deal with non-homogeneous processes which are overdispersed
relative to a Poisson process, e.g., a non-homogeneous Poisson
cluster process; some work has been done by Lewis and Robinson
(1974).

3.2.2. Spectral analysis of point processes. By spectral analysis
of a point process (Bartlett, 1963) we mean the second-order spectrum
of the counting function $N(t)$ of the point process (the count
spectrum). Brillinger (1972) has put this spectral analysis on a
firm footing in the context of a general spectral theory for stationary
interval functions such as $N(t)$. He has also proposed
the use of higher-order spectra.

We can think of the spectral analysis of a point process as
an ordinary second-order spectral analysis of a function $dN(t)$
which is a series of delta functions occurring at random times $\{T_i\}$,
the times-to-events; see Lewis (1970) for a heuristic interpretation.
Note that the second-order count spectrum completely specifies a
renewal process. This spectrum, $g_+(\omega)$, is, in fact, essentially
the Fourier transform of the renewal density or the intensity function.
Note that this spectral analysis is not a second-order
spectral analysis of the intervals between events $X_i = T_i - T_{i-1}$.
The latter spectrum, denoted by $f_+(\omega)$, is useful for differentiating
between renewal processes (for which it is flat) and non-renewal
point processes. The spectrum of intervals may, in fact, be
preferable to higher-order spectra of counts (Brillinger, 1972) in
that it should exhibit fewer sampling fluctuations; in general our
feeling is that a spectral analysis of the counts and the intervals
should be tried before going to higher order spectra. The estimated
spectrum of intervals $\hat{f}_+(\omega)$ for the page exception process
of Lewis and Shedler (1973) is shown in Figure 4. The underlying
spectrum $f_+(\omega)$ is clearly not flat (i.e. equal to 2 for all $\omega$)
so that the process is neither a Poisson process nor a renewal
process. The spectrum of a mixed moving average-autoregressive
process (ARMA(1,1)), where the orders of the moving average
and the autoregression are both one, fits the estimated spectrum
well. We return to this in the next section; the two-state semi-Markov
generated point process model used by Lewis and Shedler (1973)
and several of the models defined in Section 4 have this spectrum.
The fitted interval spectrum is shown in the figure. The estimated
spectrum of counts and the fitted spectrum of counts of
the two-state semi-Markov generated point process for the page exception
process data of the Lewis and Shedler (1973) paper are
shown in Figure 5.
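
To make the idea of a flat interval spectrum concrete, the sketch below computes a raw periodogram of a sequence of intervals by direct summation. It is an illustration only: the function names are hypothetical, and the normalisation (twice the squared modulus of the transform divided by n times the sample variance) is an assumption, chosen so that uncorrelated intervals give ordinates scattered around the level 2 mentioned above. Individual ordinates fluctuate strongly, so averages over frequency bands are compared.

```python
import cmath
import math
import random


def interval_periodogram(intervals):
    """Raw periodogram of the intervals at the Fourier frequencies in (0, pi)."""
    n = len(intervals)
    mean = sum(intervals) / n
    centred = [x - mean for x in intervals]
    var = sum(c * c for c in centred) / n
    ordinates = []
    for j in range(1, n // 2):
        omega = 2.0 * math.pi * j / n
        d = sum(c * cmath.exp(-1j * omega * t) for t, c in enumerate(centred))
        # Assumed normalisation: expected value 2 for uncorrelated intervals.
        ordinates.append((omega, 2.0 * abs(d) ** 2 / (n * var)))
    return ordinates


def band_mean(ordinates, lo, hi):
    """Average of the periodogram ordinates with lo <= omega < hi."""
    values = [v for w, v in ordinates if lo <= w < hi]
    return sum(values) / len(values)


# Intervals of a Poisson process (a renewal process): i.i.d. exponential.
rng = random.Random(1)
iid_intervals = [rng.expovariate(1.0) for _ in range(1024)]
spectrum = interval_periodogram(iid_intervals)
low_band = band_mean(spectrum, 0.0, 0.5)
high_band = band_mean(spectrum, 2.5, math.pi)
```

For a renewal process both band averages should be near the same level; a marked excess at low frequencies would point to positively correlated intervals.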

One drawback to the spectral analysis of point processes is
the large amount of time required for computation of spectral estimates.
Only recently have French and Holden (1971), in an important
paper, found a way to use the fast Fourier transform (FFT) in this
context. There are some problems with this technique, e.g., it is
not bias-free, but it appears to be of great value.

We shall return to Brillinger's higher-order count spectra in
the discussion of new models. In most cases involving computer
system data there is a problem in applying spectral techniques
because of lack of stationarity. Used with care, however, spectral
techniques can indicate a switching of levels or some kind of quasi-cyclic
behavior in a system (see e.g., Lewis and Shedler, 1973).

3.2.3. Multivariate point processes. In almost all applications
in computer systems we are interested in interactions between
stochastic processes occurring at different places in space and time.
We have given some examples above; additional examples are the
following:

(i)   arrivals of requests to several [illegible] of a disk
storage device;
(ii)   times of occurrence of references to programs in a
multiprogrammed computer system;
(iii)   times of job start and termination in a multiprogrammed
computer system (Hunter and Shedler, 1977).

These examples are by no means exhaustive but illustrate
important applications. We shall discuss multivariate point processes
(Cox and Lewis, 1972) which we think of as point processes
in which qualitative marks associated with each event partition the
sets of events. Note that Brillinger's work (Brillinger, 1972)
encompasses more general situations. Cox and Lewis (1966, Ch. 8,
1972), Perkel, Gerstein and Moore (1966) and Brillinger (1972,
1975a, 1975b) discuss the analysis of multivariate point processes,
and we will make only general comments here.

a)  Dependencies between two stationary processes are usually
handled via spectral methods, i.e., second-order cross-spectra
which when normalized give quantities called coherences. This
is Brillinger's approach. It is not at all clear, however, how
useful second-order spectra are for point processes (univariate
and multivariate), which are a long way from normal processes
in which the second-order spectra completely specify the
dependence.

b)  There is a problem of specifying what kind of dependency
structures occur in multivariate point processes. We can, for
instance, generate many bivariate Poisson processes, i.e., bivariate
point processes in which the individual (marginal) processes
are Poisson. Cox and Lewis (1972) give a start at
examining these structures; also see Lawrance and Lewis (1975)
and Oakes (1976).


4.   MODELS AND MODELLING OF POINT PROCESSES IN COMPUTER SYSTEMS

In  the  previous  section  we  discussed  characteristics  of  point 
process  data  observed  in  computer  systems;  here  we  discuss  some 
aspects  of  modelling  point  processes  of  this  kind.   One  reason  for 
discussing modelling in this context is that the peculiarities of
the data have, in a fairly insistent way, led us to develop the
new models described below. Why do we need new models, in particular,
to describe the internal complexities of a computer system?
The physics and data analysis of the computer reliability problem
led to an important model (the Poisson cluster process) which has
application in many other contexts. There is, however, usually no
such physical imperative in the internal computer processes, and
the data analysis typically reveals enormous complexity which is
difficult to match to characteristics of the usual point process
models (e.g., cluster processes, doubly stochastic Poisson processes,
etc.). Besides non-stationarity, which we ignore here, complexity
of the modelling is apparent from an analysis of data on the marginal
distribution of times-between-events. In the analysis of
page exceptions given by Lewis and Shedler (1973), the marginal
distribution is found to be highly skewed and to have a discrete
component (Figure 1). None of the common point process models can
describe  the  marginal  distribution  of  such  data,  let  alone  its 
dependency  structure. 

Models  for  computer  system  performance  evaluation  have 
the  following  requirements: 
(i)   First, there is a need for descriptive and structurally
simple point process models analogous to the linear processes
used in the usual time-series analyses (e.g., Box-Jenkins
techniques). These should be easy to fit to the
data, and simple to generate on a computer, since the models
are often used in simulation studies of computer system
performance.
(ii)   Second, there is a need for models in which the marginal
distribution of times-between-events is specifiable in a
manner which is as independent of the specification of the
dependency structure of the model as is possible.

With regard to the second point, the computer system data
studies described in this paper have made us aware of the extent to
which the usual point process models are primarily concerned with
the dependency structure of the model. (The analog in ordinary
time series analysis is that in defining linear models we usually
assume that the random variables are normally distributed, or ignore
this aspect of the problem altogether.) The distribution of
times-between-events, however, is one of the most easily observed aspects
of a point process, and can be just as informative as, say, the
spectrum of counts. The estimated marginal distribution of the
page exception process given in Figure 1 has a discrete component
at $x = 1024$; this artifact of the data can be related in an
informative way to the paging process.

We describe now some recently developed stochastic sequences
which are useful as models for point processes. We intend no
implication that the constructions are unique. The sequences do have
properties, however, which make them very useful in modelling point
processes in computer systems. In particular, the marginal
distribution of the variables is an integral part of the specification
of the stochastic sequence.

4.1.   Interval Models.

Univariate point processes can be described equally well
through the structure of the intervals between events $\{X_i\}$ or the
counting process $\{N(t)\}$, where $N(t)$ gives the number of events
in $(0,t)$. We discuss the modelling of the intervals $\{X_i\}$ first.

4.1.1. The first-order autoregressive exponential model (EAR1). In
a Poisson process the intervals $\{X_i\}$ are independent and identically
distributed (i.i.d.) with exponential ($\lambda$) distribution

$$F_X(x) = 1 - e^{-\lambda x} , \qquad \lambda > 0;\ x \ge 0 . \tag{4.1}$$

Several attempts have been made to generalize the Poisson process
by making the $X_i$ dependent, but with exponential or conditionally
exponential marginal distributions (Cox, 1955). The simplest and
only really successful attempt in the sense of broad applicability
(Gaver and Lewis, 1977), gives a process called the EAR1 model,
derived from the following consideration.

A first-order autoregressive stochastic sequence is defined
by the stochastic difference equation

$$X_i = \rho X_{i-1} + \varepsilon_i , \qquad i = 0, \pm 1, \pm 2, \ldots;\ |\rho| < 1 , \tag{4.2}$$

where the $\varepsilon_i$ are assumed to be an i.i.d. stationary random
sequence. If the $\varepsilon_i$ are normally distributed, so are the $X_i$.
What must the distribution of the $\varepsilon_i$ be in order for the $X_i$
sequence to be stationary with an exponential ($\lambda$) distribution?
The answer is surprisingly easy (Gaver and Lewis, 1977).

Let $0 < \rho < 1$, and $\{E_i\}$ be an i.i.d. exponential ($\lambda$)
sequence. Now let $\varepsilon_i$ be equal to zero with probability $\rho$ and
equal to $E_i$ with probability $1-\rho$. Then we have

$$X_i = \begin{cases} \rho X_{i-1} & \text{with probability } \rho , \\ \rho X_{i-1} + E_i & \text{with probability } (1-\rho) \end{cases} \tag{4.3}$$

$$X_i = \rho X_{i-1} + V_i E_i , \tag{4.4}$$

where $\{V_i\}$ is an i.i.d. binary sequence with $V_i = 1$ with probability
$(1-\rho)$. Moreover if we let $X_0 = E_0$, and define $X_i$ as
in (4.3), the resulting sequence is stationary for $i = 0, 1, \ldots$ .

The point process with the interval structure (4.3) is called
the EAR1 point process. It is a tractable model, and most of its
important properties are given in Gaver and Lewis (1977). In
particular we have that $\rho(k) = \rho^k$. This model is in a sense
degenerate because it contains runs of $X_i$ in which values are
exactly $\rho$ times the previous value; it could, however, be a
reasonable model for point processes observed in computer systems
(e.g., inter-arrival times of requests to a storage subsystem) in
which the intervals have exponential marginal distributions but are
dependent. Note that as defined the model can only provide sequences
$\{X_i\}$ with positive serial correlations. We can, however,
define the process to include negative correlations.
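
Since the EAR1 sequence is meant to be simple to generate on a computer, a minimal simulation sketch may help. It follows the recursion $X_i = \rho X_{i-1} + V_i E_i$ with $X_0 = E_0$; the function names, the seed and the parameter values are illustrative assumptions, and the sample checks are only a rough consistency test of the exponential mean and of geometrically decaying serial correlations.

```python
import random


def simulate_ear1(n, rho, lam, seed=0):
    """Generate n intervals from the EAR1 recursion X_i = rho*X_{i-1} + V_i*E_i."""
    rng = random.Random(seed)
    x = rng.expovariate(lam)                        # X_0 = E_0
    out = [x]
    for _ in range(n - 1):
        e = rng.expovariate(lam)                    # E_i, exponential with rate lam
        v = 1 if rng.random() < 1.0 - rho else 0    # V_i = 1 with probability 1 - rho
        x = rho * x + v * e
        out.append(x)
    return out


def serial_correlation(xs, k):
    """Sample serial correlation of lag k."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((v - mean) ** 2 for v in xs) / n
    cov = sum((xs[i] - mean) * (xs[i + k] - mean) for i in range(n - k)) / (n - k)
    return cov / var


ear1 = simulate_ear1(200_000, rho=0.5, lam=2.0)
ear1_mean = sum(ear1) / len(ear1)      # should be near 1/lam
ear1_r1 = serial_correlation(ear1, 1)  # should be near rho
ear1_r2 = serial_correlation(ear1, 2)  # should be near rho**2
```

Inspecting a short stretch of the output also shows the runs in which each value is exactly $\rho$ times its predecessor, the degeneracy noted above.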

Simple  generalizations  of  this  Markovian  exponential  process 
are  the  following. 

4.1.2. The moving average exponential model (EMAk). We define
another stationary sequence $\{X_i\}$, using the $\{E_i\}$ sequence above,
according to

$$X_0 = E_0 \tag{4.5}$$

$$X_i = \beta E_i + U_i E_{i-1} , \qquad i = 1, \ldots;\ 0 \le \beta \le 1 , \tag{4.6}$$

where $\{U_i\}$ is an i.i.d. binary sequence in which $U_i = 1$ with
probability $(1-\beta)$. This is a first order exponential moving average
process (EMA1) (Lawrance and Lewis, 1977) which is one-dependent;
in particular

$$\rho(1) = \beta(1-\beta) \tag{4.7}$$

$$\rho(k) = 0 , \qquad k = 2, 3, \ldots . \tag{4.8}$$

Properties of the EMA1 process are given by Lawrance and Lewis (1977).
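
The EMA1 construction can be checked by simulation in the same spirit. The sketch below is an assumption-laden illustration (hypothetical function names, arbitrary seed and parameters) of $X_i = \beta E_i + U_i E_{i-1}$ with $X_0 = E_0$; it compares the sample lag-one correlation with $\beta(1-\beta)$ and the lag-two correlation with zero.

```python
import random


def simulate_ema1(n, beta, lam, seed=0):
    """Generate n intervals from X_i = beta*E_i + U_i*E_{i-1}, with X_0 = E_0."""
    rng = random.Random(seed)
    e_prev = rng.expovariate(lam)
    out = [e_prev]                                   # X_0 = E_0
    for _ in range(n - 1):
        e = rng.expovariate(lam)                     # E_i
        u = 1 if rng.random() < 1.0 - beta else 0    # U_i = 1 with probability 1 - beta
        out.append(beta * e + u * e_prev)
        e_prev = e
    return out


def serial_correlation(xs, k):
    """Sample serial correlation of lag k."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((v - mean) ** 2 for v in xs) / n
    cov = sum((xs[i] - mean) * (xs[i + k] - mean) for i in range(n - k)) / (n - k)
    return cov / var


ema1 = simulate_ema1(200_000, beta=0.5, lam=1.0)
ema1_mean = sum(ema1) / len(ema1)        # should be near 1/lam
ema1_r1 = serial_correlation(ema1, 1)    # should be near beta*(1 - beta)
ema1_r2 = serial_correlation(ema1, 2)    # should be near 0 (one-dependence)
```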

It is easy to see that we can make $E_{i-1}$ in (4.6) a random
linear combination of $E_{i-1}$ and $E_{i-2}$ to get an EMA2 process, and
can continue the process back $k$ steps to obtain an EMAk process.
In addition, by making $E_{i-k}$ autoregressive over the previous $E_i$,
we obtain a mixed kth order moving-average, first order autoregressive
process which we denote by EARMA(1,k).

4.1.3. The EARMA(1,1) model. Consider explicitly the case $k = 1$.
The first order moving-average and first order autoregressive process
EARMA(1,1) is given by

$$X_i = \beta E_i + U_i A_{i-1} \tag{4.9}$$

with

$$A_{i-1} = \rho A_{i-2} + V_{i-1} E_{i-1} \tag{4.10}$$

for $i = 1, 2, 3, \ldots$ and $A_0 = E_0$. This sequence of random
variables is not Markovian.

The second-order correlation structure of the process is given
by

$$\rho(k) = \rho^{k-1} c(\beta,\rho) , \tag{4.11}$$

where

$$c(\beta,\rho) = \beta(1-\beta) + \rho(1-\beta)(1-2\beta) . \tag{4.12}$$

The point process whose intervals have the EARMA(1,1) structure is
discussed in detail in Jacobs and Lewis (1977a). In particular,
for $\beta = 1$ it is a Poisson process. The process is very simple to
generate on a computer and is very useful for modelling dependent
sequences in queueing systems. It is possible to give an extension
to processes in which the $X_i$ are Gamma distributed, but not much
beyond this. In fact a necessary condition to ensure that we can
find an $\varepsilon_i$ in the fundamental relationship (4.2) to give a
specified distribution $F(x)$ for $X_i$ is that $F(x)$ be
infinitely divisible.
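
As a hedged illustration of how simply the EARMA(1,1) sequence can be generated, the sketch below implements $X_i = \beta E_i + U_i A_{i-1}$ together with the autoregressive update $A_i = \rho A_{i-1} + V_i E_i$ and $A_0 = E_0$, and compares sample serial correlations with $\rho^{k-1}[\beta(1-\beta) + \rho(1-\beta)(1-2\beta)]$. Function names, seed and parameter values are assumptions made for the example.

```python
import random


def simulate_earma11(n, beta, rho, lam, seed=0):
    """Generate X_1..X_n with X_i = beta*E_i + U_i*A_{i-1}, A_i = rho*A_{i-1} + V_i*E_i."""
    rng = random.Random(seed)
    a_prev = rng.expovariate(lam)                    # A_0 = E_0
    out = []
    for _ in range(n):
        e = rng.expovariate(lam)                     # E_i
        u = 1 if rng.random() < 1.0 - beta else 0    # U_i = 1 with probability 1 - beta
        v = 1 if rng.random() < 1.0 - rho else 0     # V_i = 1 with probability 1 - rho
        out.append(beta * e + u * a_prev)            # moving-average step
        a_prev = rho * a_prev + v * e                # autoregressive step
    return out


def earma11_correlation(k, beta, rho):
    """Serial correlation of lag k >= 1 implied by the formulas in the text."""
    return rho ** (k - 1) * (beta * (1 - beta) + rho * (1 - beta) * (1 - 2 * beta))


def serial_correlation(xs, k):
    """Sample serial correlation of lag k."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((v - mean) ** 2 for v in xs) / n
    cov = sum((xs[i] - mean) * (xs[i + k] - mean) for i in range(n - k)) / (n - k)
    return cov / var


earma = simulate_earma11(200_000, beta=0.3, rho=0.6, lam=1.0)
earma_mean = sum(earma) / len(earma)
earma_r1 = serial_correlation(earma, 1)
earma_r2 = serial_correlation(earma, 2)
```

Setting `beta=1.0` switches off the dependence, so the same routine then returns i.i.d. exponential intervals.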

We  discuss  now  a  possibly  broader  but  more  complex  model  for 
point  processes  having  a  specified  interval  distribution. 

4.1.4. The semi-Markov generated point process with fixed marginal
distribution. The question arises as to whether there are interval
processes $\{X_i\}$ with exponential marginal distributions and
ARMA(1,1) second-order correlation structure and which cover a
broader range of correlation than the EARMA(1,1) process (though
perhaps at a cost of more complicated structure).

We discuss briefly one such process. It is a special case of
the semi-Markov generated point process introduced by Cox (1963)
and extended by Haskell and Lewis (1977). We first describe the
two-state semi-Markov generated model. In this model there are
two types of intervals with distributions $F_1(x)$ and $F_2(x)$,
sampled in accordance with a two-state Markov chain for which the
one-step transition matrix is

$$P = \begin{pmatrix} \alpha_1 & 1-\alpha_1 \\ 1-\alpha_2 & \alpha_2 \end{pmatrix} \tag{4.13}$$

and

$$\pi = \pi P = \left( \frac{1-\alpha_2}{2-\alpha_1-\alpha_2} ,\ \frac{1-\alpha_1}{2-\alpha_1-\alpha_2} \right) . \tag{4.14}$$


When we form the point process we assume that no information is
available about the type of interval, i.e., that in the actual bivariate
point process of transitions we suppress knowledge of the
type of transition. Then the distribution of an interval between
transitions (events) $X_i$ in the stationary point process is

$$F_X(x) = \pi_1 F_1(x) + \pi_2 F_2(x) \tag{4.15}$$

and the correlation between $X_i$ and $X_{i+k}$ is

$$\rho(k) = M \beta^k , \qquad k = 1, 2, \ldots , \tag{4.16}$$

where $M$ is a positive constant and $\beta = \alpha_1 + \alpha_2 - 1 = \alpha_1 - (1-\alpha_2)$.
Thus the correlation structure is that of an ARMA(1,1) process. For
a derivation of this result see Cox and Lewis (1966), Ch. 7, pp. 194-196.
Lewis and Shedler (1973) use this process to model the page exception
process. The problem is to deal with the mixture distribution
(4.15) for the marginal distribution of intervals; this seems to
limit the utility of the model.

To obtain an exponential marginal distribution, consider the
following device (Jacobs and Lewis, 1977a). Fix $x_0$, where
$0 < x_0 < \infty$, and let

$$F_1(x) = \frac{\int_0^{x} \lambda e^{-\lambda u}\,du}{1 - e^{-\lambda x_0}} , \qquad 0 \le x \le x_0 ,$$

$$F_2(x) = \frac{\int_{x_0}^{x} \lambda e^{-\lambda u}\,du}{e^{-\lambda x_0}} , \qquad x > x_0 ; \tag{4.17}$$

then $F_X(x)$, the marginal distribution of an interval, is exponential
($\lambda$) if we set $\pi_1 = 1 - \exp(-\lambda x_0)$. There is one degree
of freedom left in the matrix $P$; in addition to $\lambda$, we have free
parameters $\pi_1$ (or $x_0$) and $\alpha_1$. What then is the range of $\beta$,
and can it be negative?

Straightforward manipulation shows that

$$\beta = \frac{\alpha_1 - \pi_1}{1 - \pi_1} , \tag{4.18}$$

which lies in absolute value between zero and one but can be
negative; therefore the serial correlations can be negative. Thus
the model appears to be broader than the EARMA(1,1) model. The
question of comparing the two models when $\beta$ is positive has not
yet been explored; it requires higher order interval correlations,
as discussed by Brillinger (1972).
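
A small simulation can illustrate the truncation device and the possibility of negative serial correlation. The sketch rests on assumptions that should be read as such: the transition matrix is taken to have diagonal entries $\alpha_1$ and $\alpha_2$, the stationary probability of a type 1 interval is $\pi_1 = 1 - \exp(-\lambda x_0)$, and $\alpha_2$ is obtained from $\pi_1$ and $\alpha_1$ through the stationarity condition. Function and variable names are hypothetical.

```python
import math
import random


def simulate_semi_markov_exp(n, lam, x0, alpha1, seed=0):
    """Two-state semi-Markov intervals with an exponential(lam) marginal.

    Type 1 intervals are exponential(lam) conditioned on X <= x0 and type 2
    intervals are exponential(lam) conditioned on X > x0.
    """
    pi1 = 1.0 - math.exp(-lam * x0)
    # Stationarity of the assumed chain: pi1 = (1 - alpha2) / (2 - alpha1 - alpha2).
    alpha2 = (1.0 - 2.0 * pi1 + pi1 * alpha1) / (1.0 - pi1)
    if not 0.0 <= alpha2 <= 1.0:
        raise ValueError("alpha1 is incompatible with this choice of x0")
    beta = alpha1 + alpha2 - 1.0
    rng = random.Random(seed)
    state = 1 if rng.random() < pi1 else 2           # start in the stationary law
    out = []
    for _ in range(n):
        if state == 1:
            # Inverse-CDF draw from the exponential truncated to (0, x0].
            out.append(-math.log(1.0 - rng.random() * pi1) / lam)
            state = 1 if rng.random() < alpha1 else 2
        else:
            # Memoryless property: X given X > x0 is x0 plus an exponential.
            out.append(x0 + rng.expovariate(lam))
            state = 2 if rng.random() < alpha2 else 1
    return out, beta


def serial_correlation(xs, k):
    """Sample serial correlation of lag k."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((v - mean) ** 2 for v in xs) / n
    cov = sum((xs[i] - mean) * (xs[i + k] - mean) for i in range(n - k)) / (n - k)
    return cov / var


x0_example = math.log(2.0)                            # gives pi1 = 1/2 when lam = 1
sm, sm_beta = simulate_semi_markov_exp(200_000, lam=1.0, x0=x0_example, alpha1=0.2)
sm_mean = sum(sm) / len(sm)
sm_frac_short = sum(1 for v in sm if v <= x0_example) / len(sm)
sm_r1 = serial_correlation(sm, 1)
```

In this example $\beta$ is negative, so short and long intervals tend to alternate and the lag-one sample correlation comes out negative, while the sample mean and the fraction of intervals below $x_0$ remain consistent with an exponential marginal.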

By letting $F_1(x)$ and $F_2(x)$ be a partitioning as in (4.17)
of any specified distribution $F(x)$ we obtain a point process whose
marginal interval distribution is the specified distribution $F(x)$
(discrete, continuous or mixed), and which has ARMA(1,1) type
second-order interval spectrum and known count spectrum. We note
that there is a choice of $x_0$ which gives a geometrically decaying
$\rho(k)$, but unlike the EAR1 process, the resulting process is not
Markovian.

By performing the same type of truncation on an n-state semi-Markov
generated point process (Haskell and Lewis, 1977) it is
possible to obtain a point process with specified marginal distribution,
almost any ARMA-type second-order interval correlation
structure (i.e., spectra which are ratios of polynomials in $\cos \omega$)
and known count spectrum. In fact this seems to be the only point
process model for which all these characteristics are known and
easily computable. Properties of this model have not yet been
fully explored. The one disadvantage of this model vis-a-vis the
exponential models described in the previous section is that, since
the model is not a probabilistic linear combination of random variables,
it is not easy to relate to intuitive considerations when
used in computer system models. We return to this aspect of the
modelling below when we discuss multivariate point processes.

4.2.   Models for Counts.


It is not always possible to observe the exact times of events
in a point process and in fact, with respect to computer system data,
such data gathering can be very expensive. What is more usual is
to observe the counts of events in successive intervals of a fixed
length $\Delta$. We denote the differential counts of events in successive
intervals by $N_i$, $i = 0, 1, \ldots$ . To model the $\{N_i\}$ sequence
we need, in general, models for dependent sequences of positive
valued, discrete random variables. Of course if we observe a
Poisson process the $\{N_i\}$ are independent and Poisson distributed.
Otherwise, we know of no model, defined in terms of exact occurrences
of events, for which the characteristics of the $N_i$ process are
simple or known.

The modified semi-Markov generated sequence of Section 4.1.4
yields a simple model for counts by letting $F(x)$ be a discrete
distribution. It would be interesting to see how closely we can
approximate the differential count process of a Poisson cluster
process this way.

An even simpler model for counts follows. Its main drawback
is that, as defined, only positive correlations are representable.

4.2.1. The discrete mixed autoregressive-moving average DARMA(1,N+1)
process. Although analogous in definition to the EARMA process,
this process is very different in structure and much broader. Let
the sequences $\{U_i\}$ and $\{V_i\}$ be as above, and $\{E_i\}$ be an i.i.d.
sequence with any distribution $\Pi(x)$. Then the DAR1 process defined
by

$$N_i = V_i N_{i-1} + (1-V_i) E_i$$

is a first-order Markov process in which the $N_i$ have distribution
$\Pi(x)$. Since successive values of the $N_i$ can be identical, the
model is useful for discrete valued processes such as the differential
count process $\{N_i\}$. The process is a Markov chain with
transition probabilities

$$P\{N_{i+1} = \ell \mid N_i = k\} = P(k,\ell) = \begin{cases} (1-\rho)\,\pi(\ell) & \text{for } k \ne \ell , \\ \rho + (1-\rho)\,\pi(\ell) & \text{for } k = \ell . \end{cases}$$

Observe a difference from the usual Markov chain modelling. The
marginal distribution $\Pi(x)$ of the $N_i$ is specified first and
then the dependency structure is specified by the single parameter
$\rho$. The model has the same drawback as the EAR1 model; the correlations
are all positive, although this is not an enormous drawback
when analyzing sequences of positive valued random variables.
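
The DAR1 recursion is easy to simulate, and a sketch may clarify how the marginal distribution and the dependence are specified separately. The code below assumes that $V_i = 1$ with probability $\rho$, which is the choice consistent with the transition probabilities just given; the function names, the three-point marginal distribution and the parameter values are illustrative assumptions.

```python
import random


def simulate_dar1(n, rho, values, probs, seed=0):
    """Generate N_i = V_i*N_{i-1} + (1 - V_i)*E_i with P(V_i = 1) = rho (assumed)."""
    rng = random.Random(seed)
    current = rng.choices(values, weights=probs)[0]   # start from the marginal law
    out = [current]
    for _ in range(n - 1):
        if rng.random() >= rho:                       # V_i = 0: take a fresh E_i
            current = rng.choices(values, weights=probs)[0]
        out.append(current)                           # V_i = 1: repeat N_{i-1}
    return out


def serial_correlation(xs, k):
    """Sample serial correlation of lag k."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((v - mean) ** 2 for v in xs) / n
    cov = sum((xs[i] - mean) * (xs[i + k] - mean) for i in range(n - k)) / (n - k)
    return cov / var


counts = simulate_dar1(100_000, rho=0.7, values=[0, 1, 2], probs=[0.5, 0.3, 0.2])
dar1_freq0 = counts.count(0) / len(counts)            # marginal probability of 0
dar1_r1 = serial_correlation(counts, 1)               # should be near rho
# Empirical probability of staying at 0, to compare with rho + (1 - rho)*pi(0).
pairs_from_0 = [(a, b) for a, b in zip(counts, counts[1:]) if a == 0]
dar1_stay0 = sum(1 for a, b in pairs_from_0 if b == 0) / len(pairs_from_0)
```

Changing `values` and `probs` alters the marginal distribution without touching the dependence parameter `rho`, which is the separation of roles emphasized in the text.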

It is possible to generalize the model to give a mixed moving-average
autoregressive dependency structure. This generalization
is the DARMA(1,N+1) model in Jacobs and Lewis (1977b, 1977c) defined
as follows:

Let $\{Y_i\}$ be a sequence of independent real valued random
variables having a common distribution $\pi$. Let $\{U_i\}$ and $\{V_i\}$
be independent sequences of $\{0,1\}$ random variables such that

$$P\{U_i = 1\} = \beta \quad \text{and} \quad P\{V_i = 1\} = \rho ,$$

where $\beta$ and $\rho$ are fixed constants with $0 \le \beta \le 1$ and $0 \le \rho < 1$.
Finally, let $\{S_n\}$ be a sequence of independent random variables
taking values in $\{0, 1, \ldots, N\}$ with distribution $F$, where $N$ is
a fixed non-negative integer. Let

$$X_i = U_i Y_{i-S_i} + (1-U_i) A_{i-N-1} , \qquad i = 1, 2, \ldots ,$$

where

$$A_i = V_i A_{i-1} + (1-V_i) Y_i , \qquad i = -N, -N+1, \ldots .$$

Perhaps the most interesting characteristic of the model is
that if we transform the variables $N_i$, the resulting process has
the same dependency structure as the $\{N_i\}$ process. This is because
the model is a mixture or random index model and each $N_i$ is a
randomly chosen member of the $\{E_i\}$ sequence. This model therefore
gives the ultimate in independence of the marginal distribution and
the dependency structure. In this and other ways it is very much
the analog of the normal linear processes.
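
The invariance of the dependency structure under transformation can be illustrated with a short simulation of the mixed model, written here as $X_i = U_i Y_{i-S_i} + (1-U_i) A_{i-N-1}$ with $A_i = V_i A_{i-1} + (1-V_i) Y_i$. The sketch makes several assumptions for the sake of the example: the lag $S_i$ is taken to be uniform on $\{0, \ldots, N\}$, the start value of the $A$ sequence is drawn from the marginal distribution, and all names are hypothetical.

```python
import random


def simulate_darma(n, beta, rho, big_n, draw_y, seed=0):
    """Generate X_1..X_n from the DARMA(1, N+1) recursion with N = big_n."""
    rng = random.Random(seed)
    off = big_n + 1                        # list index j corresponds to time j - off
    y = [None] * (n + off + 1)
    a = [None] * (n + off + 1)
    a[0] = draw_y(rng)                     # start value A_{-N-1} (assumption)
    for j in range(1, n + off + 1):
        y[j] = draw_y(rng)
        # V = 1 with probability rho repeats the previous A, else take the new Y.
        a[j] = a[j - 1] if rng.random() < rho else y[j]
    xs = []
    for i in range(1, n + 1):
        if rng.random() < beta:            # U_i = 1: pick Y_{i - S_i}
            s = rng.randint(0, big_n)      # S_i uniform on {0, ..., N} (assumption)
            xs.append(y[i - s + off])
        else:                              # U_i = 0: pick A_{i - N - 1}
            xs.append(a[i])                # index (i - N - 1) + off equals i
    return xs


def serial_correlation(xs, k):
    """Sample serial correlation of lag k."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((v - mean) ** 2 for v in xs) / n
    cov = sum((xs[i] - mean) * (xs[i + k] - mean) for i in range(n - k)) / (n - k)
    return cov / var


darma = simulate_darma(100_000, beta=0.5, rho=0.8, big_n=1,
                       draw_y=lambda rng: rng.choice((1, 2, 3, 4)))
darma_sq = [x * x for x in darma]                    # a transformed sequence
darma_freq1 = darma.count(1) / len(darma)            # marginal probability of 1
darma_r1 = serial_correlation(darma, 1)
darma_sq_r1 = serial_correlation(darma_sq, 1)
```

Because every value is a randomly selected member of the underlying i.i.d. sequence, the original and the squared sequences give estimates of the same lag-one correlation.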

Although  we  have  introduced  the  DARMA(1,N+1)  sequence  as  a 
model  for  the  differential  count  process,  it  has  also  been  used  in 
Shedler  (1977)  to  model  sequences  of  event  marks  in  multivariate 
point  processes.   In  this  context,  event  types  generally  provide 
qualitative  information  about  the  multiprogrammed  processing  of 
jobs  (e.g.  job  start,  job  termination,  jobstream  identity)  whereas 
event  marks  provide  quantitative  workload  information. 

4.3.   Multivariate  Processes  and  Systems  Modelling. 

The  use  of  multivariate  point  process  models  in  computer 
system  evaluation  studies  is  quite  recent.   Hunter  and  Shedler 
(1977)  have  defined  particular  marked  multivariate  point  process 
models and used them for the prediction of response times in multiprogrammed
systems. To illustrate another approach, we discuss use
of the exponential processes EARMA(1,k) to model a single-server
first-come-first-served queue in which the service times and interarrival
times have exponential marginal distributions. We choose
this queueing structure for simplicity of exposition; it illustrates
the power of the random-linear structure of the EARMA(1,k) model in
modelling  queues  with  dependence.   Moreover,  it  is  possible  to  use 
the  technique  to  incorporate  realistic  workload  characteristics 
into  networks  of  queues  used  as  models  for  the  structure  of  computer 
systems.   We  can  also  use  the  resulting  bivariate  process  of 
service  and  interarrival  times  as  a  model  for  a  bivariate  point 
process  or  a  highly  correlated  univariate  process  in  which  there 
are  quantitative  marks  associated  with  each  event. 

Let S_i denote the service time for the ith arrival at the
queue, and X_i denote the time between arrival of the ith and
(i-1)st customer. If these are i.i.d. exponential random variables
with parameters λ and α respectively, we have an M/M/1 queue.
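
As a minimal sketch of this baseline case, the delays in the M/M/1 queue can be simulated with the standard Lindley recursion W_{i+1} = max(0, W_i + S_i - X_{i+1}), where W_i is the time the ith customer waits before service begins. The function name, the argument names and the choice of Python are illustrative assumptions and are not taken from the studies cited here.

```python
import random


def simulate_mm1_waits(service_rate, arrival_rate, n_customers, seed=0):
    """Delays before service for successive customers in a FCFS
    single-server queue with i.i.d. exponential service times S_i
    and i.i.d. exponential interarrival times X_i (hypothetical helper)."""
    rng = random.Random(seed)
    waits = []
    wait = 0.0  # the first customer is assumed to find an empty system
    for _ in range(n_customers):
        waits.append(wait)
        s = rng.expovariate(service_rate)  # S_i, service time of this customer
        x = rng.expovariate(arrival_rate)  # X_{i+1}, gap until the next arrival
        # Lindley recursion: delay carried over to the next customer
        wait = max(0.0, wait + s - x)
    return waits


if __name__ == "__main__":
    waits = simulate_mm1_waits(service_rate=2.0, arrival_rate=1.0,
                               n_customers=200_000, seed=1)
    print("mean delay:", sum(waits) / len(waits))
```

With service rate 2 and arrival rate 1 the long-run mean delay should be close to the classical M/M/1 value (1/2)/(2 - 1) = 0.5, which gives a simple check on the simulation.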

Now for i = 0, ±1, ..., let {E_i} be exponential(λ)
and independent, and {ε_i} be exponential(α) and independent.
In addition the {E_i} and {ε_i} sequences are mutually independent.
We want a queue with autocorrelated and cross-correlated
service and arrival times such that it gives the M/M/1 queue as a
special case, and proceed as follows.

Let {S_i} be an EARMA(q,k) process over {E_i, (α/λ)ε_i, (α/λ)ε_{i-1}, ...},
where q = 0 or 1, and k = 0, 1, 2, ... . Then if X_i = ε_i, i = 0,
±1, ±2, ..., we have that {S_i} is EARMA(q,k) and also cross-correlated
with X_i = ε_i; although {X_i} is still a Poisson process,
{S_i, X_i} is a bivariate sequence of random variables with
exponential marginal distributions.

More general schemes are possible, but the above scheme has
the following simple interpretation. We have positive correlation
between S_i and, most particularly, the previous k interarrival
times. If the ε_j, and consequently the X_j, j = i, i-1, ..., i-k-1,
are short, then S_i will tend to be short. Thus this scheme models
the case where the server tends to speed up if the queue gets long.
Investigation of such schemes in simple queueing networks is
underway;  see  Jacobs  (1977).   In  particular  we  know  that  correlation 
does  affect  quantities  associated  with  the  queueing  networks. 
Specific  analytic  results  are  hard  to  obtain,  but  the  simplicity 
of  the  EARMA  models  makes  it  easy  to  simulate  the  queues. 
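
To illustrate how easily such a queue can be simulated, the following sketch treats the simplest dependent case, q = 0 and k = 1. It assumes the first-order moving-average form S_i = βE_i + I_i(α/λ)ε_i, where I_i is 1 with probability 1-β and 0 otherwise. This form, the parameter name beta and all function names are assumptions made for illustration, and the construction intended above may differ in detail. Under this assumption S_i remains exponential(λ), its correlation with X_i = ε_i is 1-β, and β = 1 recovers the M/M/1 queue.

```python
import random


def simulate_correlated_queue(lam, alpha, beta, n_customers, seed=0):
    """FCFS single-server queue whose service time S_i is positively
    correlated with the preceding interarrival time X_i.
    Assumed form: S_i = beta*E_i + I_i*(alpha/lam)*eps_i, X_i = eps_i."""
    rng = random.Random(seed)
    services, gaps, waits = [], [], []
    wait = 0.0
    prev_service = None
    for _ in range(n_customers):
        eps = rng.expovariate(alpha)     # X_i = eps_i, exponential(alpha)
        e = rng.expovariate(lam)         # E_i, exponential(lam)
        include = rng.random() >= beta   # I_i = 1 with probability 1 - beta
        s = beta * e + ((alpha / lam) * eps if include else 0.0)
        if prev_service is not None:
            # Lindley recursion: W_i = max(0, W_{i-1} + S_{i-1} - X_i)
            wait = max(0.0, wait + prev_service - eps)
        services.append(s)
        gaps.append(eps)
        waits.append(wait)
        prev_service = s
    return services, gaps, waits


def sample_correlation(a, b):
    """Ordinary product-moment correlation of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5


if __name__ == "__main__":
    for beta in (1.0, 0.4):
        s, x, w = simulate_correlated_queue(2.0, 1.0, beta, 100_000, seed=7)
        print(beta, sum(s) / len(s), sample_correlation(s, x), sum(w) / len(w))
```

Running the loop for β = 1 and β < 1 with the same marginal rates shows directly how the dependence alters the mean delay while the mean service time stays at 1/λ.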

4.4.   Conclusions 

We  have  presented  in  this  section  a  number  of  models  for 
positive  valued  time  series  with  continuous  or  discrete  ranges 
which  should  be  useful  in  modelling  the  interval  or  differential 
count  processes  of  point  processes  which  occur  in  computer  systems. 
Although  the  models  are  not  motivated  by  an  underlying  physical 
structure,  they  have  simple  probabilistic  structure,  and  therefore 
should  be  convenient  in  modelling  and  simulating  computer  systems. 
Their  structural  simplicity  should  also  make  them  easier  to  fit 
to  data  than  most  standard  point  process  models.   In  particular, 
the  fact  that  the  specification  is  in  terms  of  easily  measured 
marginal  distributions  and  second  order  autocorrelation  properties 
should make rough validation and fitting quite simple. More
detailed  statistical  methods  are  under  development;  see  Jacobs 
and  Lewis  (1977c).   Differentiating  among  related  models,  for 
example  the  three  models  having  exponential  marginal  distributions 
and  ARMA(1,1)  correlation  structure,  will  probably  entail  use  of 
higher order interval and count spectra.
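
As a rough illustration of fitting from the marginal distribution and second-order properties, the sketch below simulates a first-order autoregressive exponential sequence and recovers its two parameters by moments. It assumes the form X_n = ρX_{n-1} + I_nE_n, with I_n equal to 1 with probability 1-ρ and 0 otherwise, for which the marginal distribution is exponential(λ) and the lag-one autocorrelation is ρ. The function names are hypothetical, and the estimator is a simple moment match rather than the more detailed methods under development.

```python
import random


def simulate_ear1(lam, rho, n, seed=0):
    """Simulate n values of the assumed first-order autoregressive
    exponential sequence X_n = rho*X_{n-1} + I_n*E_n."""
    rng = random.Random(seed)
    x = rng.expovariate(lam)  # start from the stationary exponential marginal
    series = []
    for _ in range(n):
        series.append(x)
        # innovation is E_n with probability 1 - rho, and 0 otherwise
        innovation = rng.expovariate(lam) if rng.random() >= rho else 0.0
        x = rho * x + innovation
    return series


def fit_by_moments(series):
    """Estimate (lam, rho) from the sample mean and the
    lag-one sample autocorrelation."""
    n = len(series)
    mean = sum(series) / n
    var = sum((v - mean) ** 2 for v in series) / n
    cov1 = sum((series[i] - mean) * (series[i + 1] - mean)
               for i in range(n - 1)) / n
    return 1.0 / mean, cov1 / var


if __name__ == "__main__":
    data = simulate_ear1(lam=2.0, rho=0.6, n=200_000, seed=3)
    print(fit_by_moments(data))
```

The same two summary statistics cannot separate models that share an exponential marginal distribution and the same correlation structure, which is why higher order properties would be needed for that purpose.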